Adding Syntax to Dynamic Programming for Aligning Comparable Texts for the Generation of Paraphrases
نویسندگان
چکیده
Multiple sequence alignment techniques have recently gained popularity in the Natural Language community, especially for tasks such as machine translation, text generation, and paraphrase identification. Prior work falls into two categories, depending on the type of input used: (a) parallel corpora (e.g., multiple translations of the same text) or (b) comparable texts (non-parallel but on the same topic). So far, only techniques based on parallel texts have successfully used syntactic information to guide alignments. In this paper, we describe an algorithm for incorporating syntactic features in the alignment process for non-parallel texts with the goal of generating novel paraphrases of existing texts. Our method uses dynamic programming with alignment decision based on the local syntactic similarity between two sentences. Our results show that syntactic alignment outrivals syntax-free methods by 20% in both grammaticality and fidelity when computed over the novel sentences generated by alignment-induced finite state automata.
منابع مشابه
An Optimal Tax Relief Policy with Aligning Markov Chain and Dynamic Programming Approach
Abstract In this paper, Markov chain and dynamic programming were used to represent a suitable pattern for tax relief and tax evasion decrease based on tax earnings in Iran from 2005 to 2009. Results, by applying this model, showed that tax evasion were 6714 billion Rials**. With 4% relief to tax payers and by calculating present value of the received tax, it was reduced to 3108 billion Rials. ...
متن کاملExtracting Lay Paraphrases of Specialized Expressions from Monolingual Comparable Medical Corpora
Whereas multilingual comparable corpora have been used to identify translations of words or terms, monolingual corpora can help identify paraphrases. The present work addresses paraphrases found between two different discourse types: specialized and lay texts. We therefore built comparable corpora of specialized and lay texts in order to detect equivalent lay and specialized expressions. We ide...
متن کاملAligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
Generating coherent discourse is an important aspect in natural language generation. Our aim is to learn factors that constitute coherent discourse from data, with a focus on how to realize predicate-argument structures in a model that exceeds the sentence level. We present an important subtask for this overall goal, in which we align predicates across comparable texts, admitting partial argume...
متن کاملReverse Engineering of Network Software Binary Codes for Identification of Syntax and Semantics of Protocol Messages
Reverse engineering of network applications especially from the security point of view is of high importance and interest. Many network applications use proprietary protocols which specifications are not publicly available. Reverse engineering of such applications could provide us with vital information to understand their embedded unknown protocols. This could facilitate many tasks including d...
متن کاملStochastic Short-Term Hydro-Thermal Scheduling Based on Mixed Integer Programming with Volatile Wind Power Generation
This study addresses a stochastic structure for generation companies (GenCoʼs) that participate in hydro-thermal self-scheduling with a wind power plant on short-term scheduling for simultaneous reserve energy and energy market. In stochastic scheduling of HTSS with a wind power plant, in addition to various types of uncertainties such as energy price, spinning /non-spinning reserve prices, unc...
متن کامل